ABSTRACT



A system and method for receiving a transported stream of 
data packets includes a buffer management device for 
receiving the data packets, unpacking the data packets, and 
forwarding a stream of data frames. The system and method 
further includes a first jitter buffer for receiving the data 
frames from the buffer management device and buffering the 
data frames, and a second jitter buffer for receiving the data 
frames from the buffer management device and buffering the 
data frames. In addition, the system and method includes a 
computationally-desirable jitter buffer selected from the first 
jitter buffer or the second jitter buffer by comparing a first 
jitter buffer quality and a second jitter buffer quality. The 
system and method also includes a decoder for receiving 
buffered data frames from the computationally-desirable 
jitter buffer. 

31 Claims, 13 Drawing Sheets 
SYSTEM FOR REAL TIME
COMMUNICATION BUFFER 
MANAGEMENT 

This is a continuation-in-part application of patent appli- 
cation Ser. No. 08/942,446, entitled "Method and Apparatus 
for Real Time Communication Over Packet Networks," filed 
Oct. 1, 1997, now U.S. Pat. No. 6,175,871 and specifically 
incorporated in its entirety herein by reference. This is also 
a continuation-in-part application of U.S. patent application 
Ser. No. 09/241,689, entitled "System for Dynamic Jitter
Buffer Management Based on Synchronized Clocks," filed
on Feb. 2, 1999, now U.S. Pat. No. 6,360,271 and specifically
incorporated in its entirety herein by reference.

REFERENCE TO COMPUTER PROGRAM 
LISTING APPENDICES SUBMITTED ON 
COMPACT DISC 

The originally filed specification for the present applica- 
tion included Appendices A-C, which contained paper print- 
outs of several computer program listings and an output file. 
Two compact discs containing electronic text copies of the
computer program listings and output file of Appendices 
A-C have been submitted for the present application. These 
electronic copies of Appendices A-C have been labeled with 
the appropriate identification for this application, and one of 
the compact discs has been labeled "Copy 1," while the 
other has been labeled "Copy 2." The compact disc labeled 
"Copy 2" is identical to the one labeled "Copy 1," and both 
compact discs are specifically incorporated herein by refer- 
ence. 

Each of the submitted compact discs is formatted for a PC 
type workstation with an MS-Windows based operating 
system, and includes the serial label number of 011129_1352.
The following is a list of the folders and files on each
of the two submitted compact discs: 

Folder: Appendix A
    File: buffer_mgmt.cc.txt (Size: 9 KB; Dated: Nov. 29, 2001)

Folder: Appendix B
    File: VoIP Output File.txt (Size: 7 KB; Dated: Nov. 29, 2001)

Folder: Appendix C
    File: buffer.h.txt (Size: 7 KB; Dated: Nov. 29, 2001)
    File: voicebuffer.cc.txt (Size: 13 KB; Dated: Nov. 29, 2001)
    File: voicebuffer.h.txt (Size: 3 KB; Dated: Nov. 29, 2001)

COPYRIGHT NOTICE AND AUTHORIZATION 

A portion of the disclosure of this patent document 
contains material that is subject to copyright protection. The 
copyright owner has no objection to the facsimile reproduc- 
tion by anyone of the patent document or the patent 
disclosure, as it appears in the Patent and Trademark Office 
patent file or records, but otherwise reserves all copyright 
rights whatsoever. 

BACKGROUND OF THE INVENTION 
A. Field of the Invention 

This invention relates to the field of telecommunications 
and more specifically to a method and apparatus for choos- 
ing buffer size and error correction coding for real time 
communication over packet networks. 
B. Description of Related Art and Advantages of the
Invention 

Real time communications such as audio or video can be encoded
using various compression techniques. The encoded information can
then be placed in data packets with time and sequence information
and transported via non-guaranteed Quality of Service (QoS) packet
networks. Non-guaranteed packet switched networks include a Local
Area Network (LAN), Internet Protocol Network, frame relay
network, or an interconnected mixture of such networks such as an
Internet or Intranet. One underlying problem with non-guaranteed
packet networks is that transported packets are subject to varying
loss and delays. Therefore, for real-time communications, a
tradeoff exists among the quality of the service, the interactive
delay, and the utilized bandwidth. This tradeoff is a function of
the selected coding scheme, the packetization scheme, the
redundancy of information packeted within the packets, the
receiver buffer size, the bandwidth restrictions, and the
transporting characteristics of the transporting network.

One technique for transporting real time communication between
two parties over a packet switched network requires that both
parties have access to multimedia computers. These computers must
be coupled to the transporting network. The transporting network
could be an Intranet, an Internet, a wide area network (WAN), a
local area network (LAN), or other type of network utilizing
technologies such as Asynchronous Transfer Mode (ATM), Frame
Relay, Carrier Sense Multiple Access, Token Ring, or the like. As
in the case for home personal computers (PCs), both parties to the
communication may be connected to the network via telephone lines.
These telephone lines are in communication with a local hub
associated with a central office switch and a Network Service
provider. As used herein, the term "hub" refers to an access point
of a communication infrastructure.

This communication technique, however, has a number of
disadvantages. For example, for a home-based PC connected to a
network using an analog telephone line, the maximum bandwidth
available depends on the condition of the line. Typically, this
bandwidth will be no greater than approximately 3400 Hz. A known
method for transmitting and receiving data at rates of up to 33.6
kbits/second over such a connection is described in Recommendation
V.34, published by the International Telecommunication Union,
Geneva, Switzerland.

Aside from a limited bandwidth, various delays inherent in the PC
solution, such as sound card delays, modem delays, and other
related delays are relatively high. Consequently, the PC-based
communication technique is generally unattractive for real-time
communication. As used herein, "real-time communication" refers to
real-time audio, video, or a combination of the two.

Another typical disadvantage of PC-based communication,
particularly with respect to PC-based telephone communications, is
that the communicating PC receiving the call generally needs to be
running at the time the call is received. This may be feasible for
a corporate PC connected to an Intranet. However, such a
connection may be burdensome for a home-based PC, since the home
PC may have to tie up a phone line.

Another disadvantage is that a PC-based conversation is similar to
conversing over a speakerphone. Hence, privacy of conversation may
be lost. Communicating over a speakerphone may also present
problems in a typical office environment having high ambient noise
or having close working arrangements.
In addition, PC-based telephone systems often require powerful and
complex voice encoders and therefore require a large amount of
processing capability. Even if these powerful voice encoders run
on a particularly powerful PC, the encoders may slow down the PC
to a point where the advantage of document sharing decreases,
since the remaining processing power may be insufficient for a
reasonable interactive conversation. Consequently, a caller may
have to use less sophisticated encoders, thereby degrading the
quality of the call.

A general problem encountered in packet switched networks,
however, is that the network may lose data packets. Packets may
also be delayed during transportation from the sender to the
receiver. Therefore, some of the packets at a receiving
destination will be missing and others will be "jittered" and
therefore arrive out of order.

In a packet switched network whose transporting characteristics
vary relatively slowly, the immediate past transporting
characteristics can be used to infer information about the
immediate future transporting characteristics. The dynamic network
transporting characteristics may be measured using such variables
as packet loss, packet delay, packet burst loss, loss
auto-correlation, bandwidth, and delay variation.

IP gateways, such as IP telephony receivers, may employ a
configuration of computational buffers or jitter buffers to mask
network-induced expansion and contraction of packet inter-arrival
times. Although IP telephony transmitters may send packets with
deterministic inter-departure times, IP networks such as the
Internet will "jitter" (i.e., introduce delay variance) and lose
packets as the packets are transported through some number of
switches and routers before the packets arrive at the IP gateway,
such as the IP telephony receiver. The greater the jitter buffer
depth, the more jitter that the communication channel can mask.

If packet arrivals are highly skewed with respect to buffer depth,
packets may be lost due to buffer overflow or buffer underflow.
However, due to the interactive nature of real time communication
over IP, particularly IP telephony, it is desirable to introduce
as little jitter buffer latency as possible. Therefore, a buffer
having a shallow depth is generally desired. IP telephony end-user
quality of service is also degraded by packet loss introduced by
the IP network itself. For example, an intermediate IP router in
between the source and destination of the real-time communication
may become temporarily overloaded, and as a result will drop
(i.e., delete) packets in response to the congestion. This packet
loss causes audible clicks, pops, and gaps in a voice
conversation, degrading the quality of the conversation.

Some packet loss may be masked through error-correction coding.
Such error correction coding techniques may include frame
replication (i.e., frame redundancy) or frame-based forward error
correction (FEC). For example, related U.S. Pat. No. 5,870,412
entitled "Forward Error Correction System for Packet Based Real
Time Media" describes a forward error correction code scheme for
transmission of real time media signals and is fully herein
incorporated by reference and to which the reader is directed for
additional details. One disadvantage of utilizing techniques such
as redundancy or FEC, however, is that they may increase the
amount of information required per data packet and therefore the
amount of required bandwidth. There is, therefore, a general need
for an IP gateway that can dynamically adjust receiving properties
based, in part, on dynamic transporting characteristics while also
attempting to optimize bandwidth.

SUMMARY OF THE INVENTION

The present invention provides a gateway for receiving a
transported stream of data packets. The gateway comprises a buffer
management device for receiving the data packets, unpacking the
data packets, and forwarding a stream of data frames. The gateway
also comprises a first jitter buffer for receiving the data frames
from the buffer management device and buffering the data frames,
and a second jitter buffer for receiving the data frames from the
buffer management device and buffering the data frames. In
addition, the gateway comprises a computationally-desirable jitter
buffer selected from the first jitter buffer or the second jitter
buffer by comparing a first jitter buffer quality and a second
jitter buffer quality. Additionally, the gateway comprises a
decoder for receiving buffered data frames from the
computationally-desirable jitter buffer.

The present invention also provides a method for receiving a
transported stream of data packets comprising the steps of
receiving the data packets at a management module, and unpacking
the data packets at the management module. The method also
comprises the steps of forwarding a first stream of data frames to
a first jitter buffer, and forwarding a second stream of data
frames to a second jitter buffer. Moreover, the method comprises
the steps of buffering the data frames at the first jitter buffer
and the second jitter buffer, computing a first jitter buffer
quality for the first jitter buffer, and computing a second jitter
buffer quality for the second jitter buffer. The method further
comprises the steps of selecting either the first or the second
jitter buffer as a computationally-desirable jitter buffer based
on the first and second jitter buffer qualities, and forwarding
the buffered data frames from the computationally-desirable buffer
to a decoder.

In addition, the present invention provides a receiver for
receiving a transported stream of data packets. The receiver
comprises a buffer management device for receiving the data
packets, unpacking the data packets and forwarding a stream of
data frames. The receiver also comprises a buffer array having a
computationally-desirable buffer and a plurality of virtual
buffers, with each buffer of the buffer array receiving and
buffering the data frames from the buffer management device. The
receiver further comprises a decoder for receiving buffered data
frames from the computationally-desirable buffer.

BRIEF DESCRIPTION OF THE DRAWINGS

Embodiments of the present invention are described herein with
reference to the drawings, in which:

FIG. 1 illustrates a preferred communication channel of the
present invention including a sender and a receiver.

FIG. 2 shows a preferred format for a data packet used with the
communication channel of FIG. 1.

FIG. 3 shows a preferred format for a frame field of the data
packet of FIG. 2, with a redundancy value of three.

FIGS. 4A-4B show preferred formats for frame fields of the data
packet of FIG. 2, with a first and a second forward error
correction scheme, respectively.

FIG. 5 provides a flowchart illustrating a packet arrival function
of a buffer management module of the receiver of FIG. 1.

FIG. 6 provides a flowchart illustrating a time out function of a
jitter buffer of the receiver of FIG. 1.

FIG. 7 provides a flowchart illustrating a play frame function of
a jitter buffer of the receiver of FIG. 1.

FIG. 8 provides a flowchart illustrating an arrival function of a
jitter buffer of the receiver of FIG. 1.

FIG. 9 provides a flowchart illustrating an insert function of a
jitter buffer of the receiver of FIG. 1.
FIG. 10 provides a flowchart illustrating an insert missing
function of a jitter buffer of the receiver of FIG. 1.

FIG. 11 provides a graphical representation for selecting a
computationally-desirable jitter buffer from the receiver set
shown in FIG. 1.

DESCRIPTION OF PREFERRED EMBODIMENTS

FIG. 1 illustrates a preferred communication channel 500 of the
present invention. Communication channel 500 generally comprises a
sender 502, a transporting medium 535, and a receiver 510.
Receiver 510 comprises a buffer management module 512, a buffer
array 514 containing a set of jitter buffers 516, and a decoder
518. As explained in more detail below, each jitter buffer 516 may
be a computationally-desirable or real jitter buffer that actually
sends information to the decoder 518, or alternatively, a virtual
jitter buffer that does not send information to the decoder 518.
It will be appreciated that the input and output media may
alternatively engage in interactive communication, in which case
the scenario depicted in FIG. 1 could be revised to be symmetric.
In that case, for instance, the transmitter or sender would also
perform the functions of a receiver, such as receiver 510, and
receiver 510 would also perform the functions of a transmitter,
such as sender 502. Further, the principles described herein could
be applied in either or both directions, such as for an
interactive telephone conversation.

Exemplary receiver 510 comprises a single array of jitter buffers.
Alternatively, receiver 510 includes more than one array of jitter
buffers. In such an alternative embodiment, the various sets of
jitter buffers may have common characteristics with one another.
For example, a receiver may contain three sets of jitter buffers
wherein the first set of jitter buffers utilizes error correction
coding, the second set of buffers utilizes redundancy, and the
third set may utilize a large buffer depth without error
correction or redundancy. For further details, the reader is
directed to the method and apparatus for selecting buffer size and
error correction coding for real time communication disclosed in
United States patent application Ser. No. 09/322,836, now U.S.
Pat. No. 6,366,959, entitled "Method and Apparatus for Real Time
Communication System Buffer Size and Error Correction Coding
Selection," commonly assigned with the present invention, and
specifically incorporated in its entirety herein by reference.

Communication channel 500 and its method of operation will now be
described with reference to FIG. 1. A calling device 570 generates
a real time media input signal 572, preferably a telephone call.
Alternatively, the input signal 572 is video, multimedia, a
streaming application, or a combination thereof. The input signal
572 is communicated to an analog-to-digital (A/D) converter 582.
The A/D converter 582 converts the input signal 572 to a digital
signal 583. Preferably, where the input signal 572 is a telephone
call, the digital signal 583 is digital speech representation.

The digital signal 583 is communicated to an encoder 580 of sender
502. In the case of a telephone call, digital signal 583 is
communicated to the encoder 580 over a telephone line. The digital
signal 583 (preferably in Pulse Code Modulated (PCM) form) is
compressed and partitioned by encoder 580 into a sequence of
frames 585. In other words, encoder 580 encodes digital signal
583.

Preferably, in the case where the communication channel 500 is
used to communicate voice, encoder 580 is an ITU voice encoder
complying with Recommendation G.723.1. Recommendation G.723.1
describes a code excited linear predictive encoder (CELP).
Recommendation G.723.1 specifies a coded representation used for
compressing speech or another audio signal component of multimedia
services at a low bit rate as part of the overall H.324 family of
standards. Recommendation G.723.1 is entitled "DUAL RATE SPEECH
ENCODER FOR MULTIMEDIA COMMUNICATIONS TRANSMITTING AT 5.3 & 6.3
KBITS/S" and is published by the Telecommunication Standardization
Sector of the ITU. Recommendation G.723.1 is herein entirely
incorporated by reference and to which the reader is directed for
further details. Alternatively, voice encoders complying with
other standards or specifications, such as ITU Recommendations
G.711 or G.729A, both of which are specifically incorporated in
their entirety herein by reference, may be used.

Preferably, the digital signal 583 sent to the encoder 580 is
digital speech representation sampled at 8000 Hz. Each sample of
the digital signal 583 is represented by a signed 16 bit integer.
The encoder 580, preferably a G.723.1 encoder, segments the
digital signal 583 into one or more frames 585. Preferably, the
first byte of each frame 585 indicates the number of bytes in the
frame, while the remainder of each frame 585 contains the
segmented digital signal 583. In addition, with a G.723.1 encoder,
each frame is preferably 30 milli-seconds (ms) in length. Thus, at
the preferred sampling rate of 8000 Hz, 30 ms represents 240
samples. Moreover, in a group of frames, the frames are preferably
arranged in decreasing sequential order.

The preferred G.723.1 encoder can operate at two different bit
rates, a low rate of 5.3 kbits/second or a high rate of 6.3
kbits/second. In the high rate setting of 6.3 kbits/s, 480 bytes
(i.e., 240 samples times 2 bytes/sample) are compressed to 24
bytes. In this high rate setting, where the input signal 572 is
voice, the encoding results in a quality that is close to toll
quality. In the low rate setting of 5.3 kbits/s, 480 bytes are
compressed to 20 bytes. Therefore, between the low and high rate
setting, the compression ratio varies from 20 to 24.

Preferably, encoder 580 utilizes silence detection. Silence begins
at the end of a talk spurt or burst, which includes the one or
more frames that make up the digital signal (i.e., digital speech
representation). Silence ends at the beginning of the next talk
spurt or burst. The G.723.1 silence detection uses a special frame
entitled Silence Insertion Descriptor (SID) frame. SID frame
generation is described in Recommendation G.723.1, which has been
herein entirely incorporated by reference and to which the reader
is directed for further details. During a "silence," as that term
is used herein, no voice data frames are generated by the encoder
580. An SID frame defines when a silence begins, preferably at the
end of a talk spurt or burst. After encoder 580 transmits an SID
frame, no further voice data frames are transmitted until the
current silence ends. Updated SID frames may, however, be sent.

One advantage of this silencing technique is that it reduces the
required overall transfer rate. Moreover, silence detection allows
for the periodic and independent evaluation of each of the jitter
buffers contained in buffer array 514 of receiver 510.
Communication channel 500 can thereby periodically monitor the
varying transportation characteristics of network 535.
Consequently, the channel may alter which jitter buffer of array
514 the channel uses during a specific time period of media
playout.

Packetizer 590 packets the frames 585 into a plurality of data
packets 592, which are in turn ordered in a data packet
sequence 508 and transported by the transporting network 535 to
the receiver 510. FIG. 2 shows a preferred format created by the
packetizer 590 for the data packets 592. Preferably, packetizer
590 places a sequence number 592a and a time stamp 592b into each
data packet 592 in front of a frame field 592c containing one or
more frames 585. The sequence number 592a identifies data packet
ordering, and is preferably comprised of four bytes. The time
stamp 592b identifies the time a specific data packet 592 was
created, and is preferably comprised of eight bytes, with the
first four bytes containing the number of seconds since 12 a.m.,
Jan. 1, 1970, and the second four bytes containing the number of
microseconds within the current second (i.e., 0-999,999). It
should be understood, however, that these sequence number and time
stamp formats are merely exemplary, and other sequence number and
time stamp formats may be used with the present invention.

As shown in FIG. 2, the format for the data packets 592 may also
include a Protocol/Message (P/M) byte 592d with the protocol
portion being the four high-order bits, and the message portion
being the four low-order bits. Preferably, the only valid values
for the protocol and message portions are "0000." Similarly, the
data packet format may further include a Spare/Redundancy (S/R)
byte 592e, with the spare portion being the four high-order bits,
and the redundancy portion being the four low-order bits. While
the spare bits may be used for performance evaluation purposes to
allow the sender 502 to piggyback an end-of-transmission token
onto the last frame 585 that will be transmitted in a
conversation, the spare bits are preferably ignored by the
receiver 510. On the other hand, the redundancy bits "0000"
through "0111" may be used to indicate the number of frames 585 in
the frame field 592c of the data packet 592. The redundancy bits
"1000" through "1111," however, may be used to indicate different
Forward Error Correction (FEC) schemes, as discussed in more
detail below.

As noted above, the data packets 592 may also include error
correction coding, such as redundancy or FEC. FIG. 3 shows a frame
field 592c for a data packet 592 with redundancy set to three. The
first byte 585a of each frame 585 in the frame field 592c
indicates the number of bytes in the frame (e.g., 20 bytes for 5.3
Kbps G.723.1), while the remainder of the frame contains a
segmented voice signal 585b. Since redundancy is set to three,
there are preferably three frames 585 in the frame field 592c,
namely the original frame (frame n) being transmitted with data
packet n, the frame (frame n-1) originally transmitted with the
preceding packet n-1, and the frame (frame n-2) originally
transmitted with the packet n-2 that preceded the packet n-1, as
shown in FIG. 3. Thus, with a redundancy of three, not only is
frame n being transmitted in the frame field 592c of the data
packet n, but so are the two frames (i.e., frame n-1 and frame
n-2) of the preceding data packets (i.e., packet n-1 and packet
n-2). As a result, the receiver 510 has an extra copy of frame n-1
and frame n-2 to use in case either of the data packets n-1 and
n-2 were lost during transmission through the transporting network
535.

As discussed in more detail below, each of the jitter buffers in
the buffer array 514 may use a different redundancy value than the
other jitter buffers. In order to evaluate the performance of the
virtual jitter buffers (compared to the computationally-desirable
or real jitter buffer) based on redundancy values, redundant
frames (i.e., frame n-1 and/or frame n-2) may be ignored by a
virtual jitter buffer with a lower redundancy value than the
computationally-desirable or real jitter buffer, or created by a
virtual jitter buffer with a higher redundancy value than the
computationally-desirable or real jitter buffer. Any extra
redundant frames that need to be created may contain arbitrary
data, since these extra redundant frames will not get played out
by the virtual jitter buffers. It should also be understood that a
redundancy value of one is the equivalent of no redundancy being
used for error correction coding, since only the original frame n
is being transmitted with the data packet n.

FIGS. 4A-4B show two frame fields 592c for data packets 592 with
two different FEC schemes. Once again, the first byte 585a of each
frame 585 in the frame field 592c preferably indicates the number
of bytes in the frame (e.g., 20 bytes for 5.3 Kbps G.723.1), while
the remainder of the frame contains a segmented voice signal 585b.
In contrast with redundancy, there is preferably only one frame
585 in the frame field 592c, namely the original frame (frame n)
being transmitted with data packet n. An FEC frame 585c, however,
is preferably appended to the original frame n. The FEC frame 585c
may have any number of different schemes, including, but not
limited to, a first scheme and a second scheme. The first scheme
is preferably indicated by redundancy bits "1000," and contains
the exclusive bitwise OR of the frame n-1 originally transmitted
with the preceding packet n-1, and the frame n-2 originally
transmitted with the packet n-2 that preceded the packet n-1
(i.e., frame n-1 XOR frame n-2), as shown in FIG. 4A. The second
scheme is preferably indicated by redundancy bits "1001," and
contains the exclusive bitwise OR of the frame n-1 originally
transmitted with the preceding packet n-1, and the frame n-3
originally transmitted with the packet n-3 that preceded the
packet n-2 (i.e., frame n-1 XOR frame n-3), as shown in FIG. 4B.
As known in the art, FEC frames and schemes may be used to recover
frames that were lost during transmission through a transporting
network. In addition, while the FEC frame is preferably appended
to the original frame n of the data packet n, it should be
understood that the FEC frame may be sent as part of a separate
packet (i.e., an FEC packet).

As discussed in more detail below, each of the jitter buffers in
the buffer array 514 may use a different FEC frame scheme than the
other jitter buffers. In order to evaluate the performance of the
virtual jitter buffers (compared to the computationally-desirable
or real jitter buffer) based on FEC schemes, the virtual jitter
buffers may simulate FEC schemes, ignoring any redundancies. Any
frames recovered with this simulation process may contain
arbitrary data, since these recovered frames will not get played
out by the virtual jitter buffers.

Each data packet time stamp enables receiver 510 to evaluate
dynamic transporting characteristics of the transporting network
535. These transporting characteristics determine how the
packetizer 590 packetizes the frames 585 and how receiver 510
unpacks these frames. More preferably, transporting
characteristics also determine whether packetizer 590 utilizes
redundancy or an alternative error correction coding, such as FEC.
Related U.S. patent application Ser. No. 08/942,446, now U.S. Pat.
No. 6,175,871, entitled "Method and Apparatus For Real Time
Communication Over Packet Networks," describes a system for
communicating real time media over a non-guaranteed network such
as network 535 shown in FIG. 1. U.S. patent application Ser. No.
08/942,446, now U.S. Pat. No. 6,175,871, has been entirely
incorporated herein by reference and the reader is directed to it
for further details. Varying transporting characteristics of
network 535 include such characteristics as the standard deviation
of one-way delay or the round trip time for each transported data
packet, packet jitter, packet loss rates, and packet delay.
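
By way of illustration only, the two XOR-based FEC schemes described above (redundancy bits "1000" and "1001") may be sketched in Python as follows. The sketch builds an FEC frame for packet n and uses it to recover a single lost frame. The function names, the dictionary keyed by sequence number, and the assumption that all frames have equal length are hypothetical and are not taken from Appendices A-C.

```python
from typing import Dict, Optional, Tuple

# Assumed mapping from redundancy bits to the two frame offsets that are
# XORed together: "1000" -> (n-1, n-2), "1001" -> (n-1, n-3).
FEC_SCHEMES: Dict[int, Tuple[int, int]] = {0b1000: (1, 2), 0b1001: (1, 3)}


def xor_frames(a: bytes, b: bytes) -> bytes:
    """Return the bitwise exclusive OR of two equal-length frames."""
    if len(a) != len(b):
        raise ValueError("frames must have equal length")
    return bytes(x ^ y for x, y in zip(a, b))


def make_fec_frame(frames: Dict[int, bytes], n: int, scheme: int) -> bytes:
    """Build the FEC frame that would accompany frame n (sender side)."""
    i, j = FEC_SCHEMES[scheme]
    return xor_frames(frames[n - i], frames[n - j])


def recover_lost_frame(
    received: Dict[int, bytes], n: int, scheme: int, fec_frame: bytes
) -> Optional[Tuple[int, bytes]]:
    """Recover one missing frame covered by the FEC frame of packet n.

    Returns (sequence number, frame) if exactly one of the two covered
    frames is missing, otherwise None.
    """
    i, j = FEC_SCHEMES[scheme]
    a, b = n - i, n - j
    if a in received and b not in received:
        return b, xor_frames(fec_frame, received[a])
    if b in received and a not in received:
        return a, xor_frames(fec_frame, received[b])
    return None
```

For example, if packet n-1 is lost but packet n-2 and packet n (carrying a first-scheme FEC frame) arrive, XORing the FEC frame with frame n-2 yields frame n-1. If both covered frames are lost, this sketch recovers nothing.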
Packet delay may generally be determined from the packet
round trip time, which is calculated by transporting a copy of
the time stamp value back to the sender 502, and comparing
the received time of the copy with the time stamp value
contained therein. This information may be sent by the
receiver 510 to the sender 502 via feedback packet 520. The
standard deviation of one-way delay is typically approximated
by averaging the absolute value of differences between time
stamp values and received times for each received packet.

Receiver 510 receives a sequence of data packets 511. This
sequence of data packets 511 may vary from the sequence of
data packets 508 originally communicated to the transporting
network 535. The variance between the two data packet
sequences 508, 511 is a function of varying transporting
characteristics.

Because preferred transporting network 535 is a
non-guaranteed packet switched network, receiver 510 may
receive packets out of order vis-a-vis other data packets
comprising the originally transported packet sequence 508.
To mask this jittering of data packet stream 508, packetizer
590 adds sequence numbers to frames 585, as explained
above. Receiver 510 has a buffer array 514 that stores
relevant data for frames. As long as the sequence number of
an arriving frame is greater than the sequence number of the
frame being played out by decoder 518, the sequence number
is used to put the unpacked frame at its correct sequential
position in each of the jitter buffers 520. Therefore, the larger
the jitter buffer size, the later a frame can arrive at receiver
510 and still be placed in a to-be-played-out frame sequence.
On the other hand, as jitter buffer size increases, the larger
the overall delay can be in transporting voice signals 583
from sender 502 to receiver 510.

Receiver 510 includes a buffer management module 512, a
buffer array 514, and decoder 518. Module 512 receives
incoming data packet sequence 511. Initially, the module 512
strips away the packet header and reads the data packets
contained in the data packet stream 511. Module 512 then
unpacks the incoming data packet stream 511 and recovers
the frames 585. The module 512 also extracts any error
correcting codes present in the data packet stream 511. If the
module 512 finds any error correcting codes, the module 512
preferably decides if any lost frames can be recovered on any
of the jitter buffers, and inserts the recovered frames into the
appropriate jitter buffer or buffers. Alternatively, the jitter
buffers themselves may be given the error correcting codes
by the module 512 to decide individually whether or not any
lost frames can be recovered, and if so, the jitter buffers may
insert any recovered frames into themselves.

The various functions and routines performed by the buffer
management module 512 are set forth in the detailed C++
language source code attached hereto at Appendix A under
the file name "buffer_mgmt.cc." The unpacking of data
packets and recovery of lost frames is represented in the
source code of Appendix A by the function command of
"packet_arrival()." A flowchart of this packet arrival function
1000 is also shown in FIG. 5. The packet arrival function
starts in Step 1001 with the arrival of a data packet at the
module 512. In Step 1002, a determination is made of
whether the data packet is the first packet (i.e., data packet n)
of the conversation or talk spurt. If so, a playout timer is
started in Step 1003, and then the sequence number from the
packet is read in Step 1004. If not, Step 1003 is skipped.
After reading the sequence number from the packet in Step
1004, the redundancy is read from the packet in Step 1005.
At this point, a counter variable, i, is set to one in Step 1006.
Next, a determination is made of whether the counter
variable, i, is less than or equal to the redundancy value in
Step 1007. If so, the first frame (i.e., frame n) is unpacked
from the data packet (i.e., data packet n) in Step 1008.
According to Step 1009, for each virtual buffer with a
redundancy greater than or equal to the counter variable, i, a
frame is inserted into the buffer in Step 1010. In addition, the
counter variable, i, is increased by one in Step 1011, and
Step 1007 is repeated.

If the counter variable, i, is greater than the redundancy
value, the packet arrival function continues with Step 1012.
For each virtual jitter buffer with a redundancy value greater
than the redundancy value of the data packet, a simulation of
the arrival of a further redundant frame is appended to the
data packet, and inserted into the virtual jitter buffer in Step
1013. This process is repeated until the redundancy value of
the virtual jitter buffer is satisfied. The packet arrival function
then continues with Step 1014. For each virtual jitter buffer
with FEC activated, a simulation of the arrival of FEC frames
is appended to the data packet, to the extent necessary, and
inserted into the virtual buffer in Step 1015. Finally, the
packet arrival function stops with Step 1016.

Preferably, but not necessarily, each jitter buffer also has an
associated FEC buffer with an FEC queue (not shown). Each
FEC buffer is a management buffer that is synchronized in
terms of sequence numbers and buffer depth with its
associated jitter buffer. The FEC queue of each FEC buffer is
comprised of an indicator sequence of "0's" and "1's" (rather
than data packets or frames), such that the "0's" represent
data packets that have not yet been received by the module
512, and the "1's" represent data packets that have been
received by the module 512. Specifically, for each data packet
that fails to arrive at the module 512 (i.e., is missing), a "0"
may be inserted into the tail end of the FEC queue associated
with each jitter buffer. In contrast, for each data packet that
arrives at the module 512, a "1" may be inserted into the tail
end of the FEC queue associated with each jitter buffer, as
shown in Step 1015 of FIG. 5.

FEC buffers allow efficient determination of which data
packets in a sequence have arrived, and depending on the
FEC scheme being used, whether or not a given missing data
packet can be reconstructed. For example, if packets n and
n-2 have been received by the module 512, the FEC buffer
queue will consist of the string "101." The module 512 needs
only to look at the FEC buffer queue to determine that
packets n and n-2 have arrived, but that packet n-1 has not
arrived. If the first FEC scheme described above was being
used by the module 512, and data packet n+1 is the next
packet to arrive at the module 512, then the lost data packet
n-1 can be reconstructed through FEC of data packet n+1.
It should be understood, however, that the module 512 may
examine the actual jitter buffer queues, rather than their FEC
buffer queues, to determine which data packets in a sequence
have arrived, and whether or not missing data packets can be
reconstructed. In such an arrangement, the use and presence
of FEC buffers would not be necessary.

As set forth in the source code of Appendix A, the module
512 is able to perform several other functions besides
unpacking data packets and recovering lost frames (i.e., the
packet arrival function). For instance, the module 512 is
capable of accepting jitter buffer parameters from an
administrative input, such as a configuration file, as
represented in the source code of Appendix A by the function
command of "read_vb_conf_file()." The configuration file
may be statically or dynamically set up to provide the module
512 with varying parameters for the jitter buffers, such as
maximum buffer length, playout, redundancy, and FEC
variables. An example of the types of parameters for sixteen
jitter buffers is attached hereto at Appendix B.
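
The FEC indicator queue described above can be illustrated with a short C++ sketch. This is not the Appendix A code: the class name `FecIndicatorQueue`, its methods, and the head-first string layout are assumptions made only for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <string>
#include <vector>

// Hypothetical indicator queue: one entry per expected data packet,
// oldest packet at the head, newest at the tail.
class FecIndicatorQueue {
public:
    // Append '1' (true) for a packet that arrived, '0' (false) for a missing one.
    void mark(bool received) { bits_.push_back(received); }

    // True if the packet at position pos (0 = head) has been received.
    bool received(std::size_t pos) const {
        assert(pos < bits_.size());
        return bits_[pos];
    }

    // Render the queue as a string of '0' and '1' characters, head first.
    std::string str() const {
        std::string s;
        for (bool b : bits_) s.push_back(b ? '1' : '0');
        return s;
    }

    // Positions of the packets still marked as missing.
    std::vector<std::size_t> missing() const {
        std::vector<std::size_t> out;
        for (std::size_t i = 0; i < bits_.size(); ++i)
            if (!bits_[i]) out.push_back(i);
        return out;
    }

private:
    std::deque<bool> bits_;
};
```

Marking packets n-2, n-1, and n as received, missing, and received, in that order, yields the string "101" from the example in the text, and `missing()` reports only the middle position.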
The buffer management module 512 is also capable of calling
a timeout every n ms, where n is the frame length in time of
the decoder 518 being used (i.e., 30 ms for G.723.1, 10 ms
for G.729A, or a variable number of ms for G.711, such as
10 ms, 15 ms, or 30 ms). This timeout is represented in the
source code of Appendix A by the function command of
"time_out()." In order to accomplish this timeout function,
the module 512 preferably receives a hard or soft interrupt
from a system clock at regular intervals (i.e., every n ms),
and passes the interrupt to each of the jitter buffers. An
example of a system clock suitable for the present invention
is disclosed in U.S. patent application Ser. No. 09/241,689,
now U.S. Pat. No. 6,360,271, entitled "System for Dynamic
Jitter Buffer Management Based on Synchronized Clocks,"
which has already been specifically incorporated in its
entirety herein by reference. Each timeout interrupt from the
module 512 indicates to the computationally-desirable or real
jitter buffer that a frame should be played out (i.e., passed to
the decoder 518).

The module 512 is also capable of determining whether the
jitter buffers are empty, and de-allocating and deleting (i.e.,
killing) the jitter buffers. These capabilities of the module 512
are represented in the source code of Appendix A by the
function commands of "all_empty()" and "kill_buffers(),"
respectively. In addition, the module 512 is preferably
capable of determining the performance of each jitter buffer,
as explained in more detail below. Also, the module 512 is
preferably capable of resetting the computationally-desirable
or real jitter buffer at the end of each talk spurt (i.e., during
silence periods) to be the jitter buffer with the best
performance over the previous talk spurt.

As stated above, frames 585 are passed by the module 512 to
buffer array 514. Redundant frames are discarded and not
buffered if their original frames have been previously
buffered. Preferably, buffer array 514 comprises a plurality of
jitter buffers 516, each of which may be either a
computationally-desirable (i.e., real) jitter buffer or a virtual
jitter buffer. Each jitter buffer receives a copy of the frames
585 from the module 512. Related U.S. patent application
Ser. No. 09/241,689, now U.S. Pat. No. 6,360,271, entitled
"System for Dynamic Jitter Buffer Management Based on
Synchronized Clocks," describes an exemplary management
system for dynamically jitter buffering a sequence of packets,
and the reader is directed to this application for further
details.

Module 512 reads the sequence number and the time stamp
of a current frame. Redundant frames associated with the
current frame have the same time stamp as the current frame
since, within a given packet, redundant and current frames
were both originally communicated from the packetizer 590
at approximately the same point in time. Since the order or
sequence of the redundant frames is known, the redundant
frame sequence numbers can be inferred from the current
frame sequence number.

In an exemplary embodiment, the buffer array 514 comprises
three jitter buffers, namely jitter buffers 531, 532, and 533. It
should be understood, however, that any desirable number of
jitter buffers may be used (i.e., 2, 16, 100, etc.), depending on
operating and user preferences. As set forth above, each jitter
buffer 531, 532, 533 preferably has a frame queue for
buffering or storing frames, and preferably, but not
necessarily, an FEC buffer for error correction coding, such
as FEC. As will be discussed, each jitter buffer has an
associated jitter buffer quality, which is periodically,
independently evaluated. Based on this evaluation, a
computationally-desirable jitter buffer is selected for a given
time period. The selected jitter buffer then acts as a "real"
jitter buffer for the given time period, thereby passing any
buffered frames to the decoder 518 for playout. It is only the
computationally-desirable or real jitter buffer that forwards
frames to the decoder 518. For example, if jitter buffer 533 is
selected as the computationally-desirable or real jitter buffer
during a certain time period, only buffer 533 passes frames
534 to decoder 518. The other jitter buffers (i.e., jitter buffers
531 and 532) are merely virtual jitter buffers, and thus do not
play out frames to the decoder 518.

Each jitter buffer is capable of performing various functions
and routines as set forth in the detailed C++ language source
code attached hereto at Appendix C, under the file names
"VoiceBuffer.cc," "VoiceBuffer.h," and "Buffer.h." It should
be understood that, according to standard C++ convention,
the longer methods of the buffer objects are defined in a C++
(.cc) file, while the simpler methods are defined in a header
(.h) file. In addition, it should also be understood that, while
the term voice buffer is used in the source code of Appendix
C instead of jitter buffer, the terms voice buffer and jitter
buffer are synonyms, and may be used interchangeably
throughout the present application and attached source codes.

Some of the variables used in the source code of Appendix C
include "fec_t fec_," "bool real," "int playout," "int ticks_,"
and "int redundancy." The "fec_t fec_" variable represents
the method of FEC to be used by the virtual buffers. The
"bool real" variable is true if the jitter buffer is real, and false
if the jitter buffer is not real (i.e., virtual). In addition, the "int
playout" variable represents the number of frame durations to
wait until the first packet is played out, and the "int ticks_"
variable represents the number of frame durations elapsed
since the arrival of the first packet. Also, the "int redundancy"
variable represents the redundancy value used by the jitter
buffers.

The functions of each jitter buffer will now be defined with
reference to the source codes of Appendix C, as well as the
flow charts shown in FIGS. 6-10. Each jitter buffer is
preferably capable of determining whether it is a real or
virtual jitter buffer, and is also capable of reporting its current
status. This status reporting is represented in the source codes
of Appendix C by the function command of "Status()." Each
jitter buffer is also capable of performing a time out function
1200, as shown in the flow chart of FIG. 6. This time out
function 1200 is represented in the source codes of Appendix
C by the function command of "TimeOut()." As shown in
FIG. 6, the time out function 1200 starts with Step 1201. In
Step 1202, a determination is made of whether the number of
frame durations elapsed since the arrival of the first packet
(i.e., ticks_) is less than the number of frame durations
necessary for the first packet to be played out (i.e., playout).
If so, the time out function 1200 increases the number of
frame durations elapsed since the arrival of the first packet
(i.e., ticks_) by one, and the time out function 1200 stops in
Step 1204. Otherwise, a frame is removed from the front or
head of the jitter buffer's queue in Step 1205. Step 1205 is
represented in the source codes of Appendix C by the
function command of "GetFrame()." Next, in Step 1206, the
removed frame is passed to the play frame function, which is
described below, and the time out function 1200 stops with
Step 1204.

The play frame function 1300 is represented in the source
codes of Appendix C by the function command of
"PlayFrame()," and is illustrated by the flowchart shown in
FIG. 7. The play frame function 1300 starts with Step 1301.
A determination is then made in Step 1302 of whether the
frame provided by the time out function 1200 is empty or
missing. If so, a further determination is made of whether the
jitter buffer is real in Step 1303. If so, a silence frame (i.e., an
SID) is played by the real jitter buffer in Step 1304, and the
play frame function 1300 stops in Step 1305. If the jitter
buffer is not real, however, then a silence frame is not played
(i.e., Step 1304 is skipped), and the play frame function 1300
simply stops with Step 1305.

If the frame provided by the time out function 1200 is not
empty or missing, then the play frame function 1300
continues with Step 1306. Similar to Step 1303, in Step 1306,
a determination is made of whether the jitter buffer is real. If
so, the frame is played by the real jitter buffer and sent to the
decoder 518, in Step 1307. If the jitter buffer is not real,
however, then the play frame function 1300 stops with Step
1305. It should be understood that regardless of whether the
jitter buffer is real or not and regardless of whether a frame is
played out, statistics for the jitter buffer are preferably
recorded for performance evaluation, as discussed in more
detail below.

Another function performed by the jitter buffers is the arrival
function 1400, which is represented in the source codes of
Appendix C by the function command of "Arrival()," and is
illustrated by the flow chart shown in FIG. 8. The arrival
function 1400 starts with Step 1401 and continues with Step
1402, where a determination is made of whether an arriving
frame's sequence number is less than or equal to the sequence
number of the last frame played by the jitter buffer. If so, the
frame is deleted in Step 1403, and the arrival function stops
in Step 1404. On the other hand, if the arriving frame's
sequence number is greater than the sequence number of the
last frame played, then the arriving frame is placed into the
queue of the jitter buffer in Step 1405. Step 1405 may also be
referred to as an insert function, which is described in more
detail below. After Step 1405, the arrival function stops with
Step 1404.

The insert function 1500 performed by the jitter buffers is
represented in the source codes of Appendix C by the
function command of "Insert()," and is illustrated by the flow
chart shown in FIG. 9. The insert function 1500 starts with
Step 1501 and proceeds with a determination of whether the
queue of the jitter buffer is full in Step 1502. If the queue is
full, the frame provided by the arrival function 1400 (see Step
1405) is discarded by the jitter buffer in Step 1503, and the
insert function 1500 stops in Step 1504. If the queue is not
full, however, a further determination is made in Step 1505 of
whether the queue is empty. If the queue is empty, yet another
determination is made in Step 1506 of whether there are any
missing frames between the current frame from the arrival
function 1400 and the last frame played by the play frame
function 1300. If so, the missing frames are inserted by an
insert missing function in Step 1507. The insert missing
function is represented in the source codes of Appendix C by
the function command of "InsertMissing()," and is described
in more detail below. After the missing frames are inserted in
Step 1507, another determination of whether or not the queue
is full is made in Step 1508. If the queue is full, the current
frame is discarded in Step 1503. Otherwise, the current frame
is placed in the queue in Step 1509. After discarding or
inserting the current frame, the insert function 1500 then stops
with Step 1504, as shown in FIG. 9. If there were no missing
frames between the current frame and the last frame played
(see Step 1506), then the insert function skips Step 1507 and
proceeds directly to Step 1508.

If the queue was not empty in Step 1505, then another
determination is made in Step 1510 of whether the current
frame's sequence number is less than the sequence number of
the frame at the head of the queue. If so, similar to Step 1506,
a further determination is made of whether there are any
missing frames between the current frame and the frame at
the head of the queue, in Step 1511. Any missing frames are
inserted in Step 1507. Otherwise, the insert function 1500
proceeds with Step 1508. If the current frame's sequence
number was greater than the sequence number of the frame at
the head of the queue in Step 1510, the insert function 1500
continues with Step 1512. In Step 1512, a determination is
made of whether the current frame's sequence number is
greater than the sequence number of the frame at the tail of
the queue. If so, similar to Steps 1506 and 1511, a further
determination is made in Step 1513 of whether there are any
missing frames between the current frame and the frame at
the tail of the queue. Any missing frames are inserted by the
insert missing function of Step 1507. Otherwise, the insert
function 1500 proceeds with Step 1508.

If the current frame's sequence number, however, is less than
the sequence number of the frame at the tail of the queue in
Step 1512, then a final determination is preferably made in
Step 1514 of whether the current frame's sequence number is
equal to the sequence number of a frame that is marked as
missing in the queue. If so, then the frame marked as missing
in the queue is replaced in Step 1515 with the current frame.
If not, the insert function 1500 discards the current frame in
Step 1503, and stops with Step 1504.

In other words, there are preferably five cases with which a
current frame provided by the arrival function 1400 (see Step
1405) will be handled by the insert function 1500. In the first
case (see Steps 1502 and 1503), if the queue of a jitter buffer
is full, the current frame is discarded. In the second case (see
Steps 1505 and 1506), if the queue is empty, and if there are
missing frames between the current frame and the frame last
played by the play frame function 1300, the missing frames
are preferably inserted by the insert missing function (i.e.,
Step 1507) described below. In addition, if there is room in
the queue, the current frame is also inserted (see Step 1509).
In the third case (see Step 1510), if there is room in the queue
of the jitter buffer, the current frame is also appended to the
front of the queue. Additionally, if there are any missing
frames between the current frame and the frame at the front
of the queue (see Step 1511), the missing frames are
preferably inserted by the insert missing function described
below. In the fourth case (see Step 1512), if there is room in
the queue of the jitter buffer, the current frame is appended to
the tail of the queue. Once again, if there are missing frames
between the current frame and the frame at the tail of the
queue (see Step 1513), the missing frames are preferably
inserted with the insert missing function described below.
Finally, in the fifth case (see Step 1514), a frame already in
the queue that is marked as missing is preferably replaced by
an equivalent current frame. It should be noted that if the
current frame does not fit into one of the above five cases,
then the current frame must be a duplicate of an already
existing frame, thereby resulting in the current frame being
discarded by the jitter buffer.

As shown in FIG. 10, the insert missing function 1600 starts
with Step 1601. A determination is then made in Step 1602
of whether the missing frames are to be inserted at the tail of
the queue. If so, the missing frames are inserted in proper
sequential order at the tail of the queue in Step 1603. The
missing frames may only be inserted at the tail of the queue,
however, to the extent the queue is empty. Moreover, for each
missing frame inserted, a "zero" is preferably inserted at the
tail of the FEC queue of the jitter buffer in Step 1604. The
insert missing function 1600 then stops in Step 1605.
If the missing frames are not to be inserted into the tail of
the queue, a determination is made in Step 1606 of whether
the missing frames are to be inserted at the head of the
queue. If so, the missing frames are inserted, preferably in
reverse sequential order, at the head of the queue in Step
1607. Once again, missing frames are only inserted in Step
1607 to the extent that the queue is empty. After Step 1607
is completed, the insert missing function 1600 stops with
Step 1605.

If the frames are not to be inserted either at the tail of the 
queue or at the head of the queue, the insert missing function 
1600 continues with Step 1608, where a determination is 
made of whether the frames are to be inserted into any other 
portion (i.e., between the tail and the head) of the queue. If 
so, to the extent that the queue is empty, the missing frames 
are inserted in sequential order into the queue in Step 1609. 
Similar to Step 1604, for each missing frame inserted, a 
"zero" is inserted at the tail end of the FEC queue in Step 
1610. Next, the insert missing function 1600 stops with Step 
1605. The insert missing function 1600 also stops with Step 
1605 if the missing frames were not to be inserted into the 
queue according to Step 1608. It should be noted that if the 
queue becomes full and there are still more missing frames 
to insert, the remaining missing frames are preferably dis- 
carded. 
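
The five insert cases and the insert missing behavior described above can be condensed into a small C++ sketch. It is a simplified illustration under stated assumptions, not the Appendix C code: the names `Slot` and `JitterQueueSketch` are hypothetical, frames are reduced to their sequence numbers, the FEC queue is omitted, and "room in the queue" is modeled as a fixed capacity.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>

// Hypothetical queue entry: a sequence number plus a "missing" placeholder flag.
struct Slot {
    int seq;
    bool missing;
};

class JitterQueueSketch {
public:
    JitterQueueSketch(std::size_t capacity, int last_played)
        : capacity_(capacity), last_played_(last_played) {
        assert(capacity > 0);
    }

    // Returns true if the frame was stored, false if it was discarded.
    bool insert(int seq) {
        if (seq <= last_played_) return false;      // arrival check: already played
        if (q_.size() >= capacity_) return false;   // case 1: queue is full
        if (q_.empty()) {                           // case 2: queue is empty
            for (int s = last_played_ + 1; s < seq && q_.size() < capacity_; ++s)
                q_.push_back({s, true});            // placeholders, sequential order
            if (q_.size() >= capacity_) return false;
            q_.push_back({seq, false});
            return true;
        }
        if (seq < q_.front().seq) {                 // case 3: goes before the head
            for (int s = q_.front().seq - 1; s > seq && q_.size() < capacity_; --s)
                q_.push_front({s, true});           // placeholders, reverse order
            if (q_.size() >= capacity_) return false;
            q_.push_front({seq, false});
            return true;
        }
        if (seq > q_.back().seq) {                  // case 4: goes after the tail
            for (int s = q_.back().seq + 1; s < seq && q_.size() < capacity_; ++s)
                q_.push_back({s, true});
            if (q_.size() >= capacity_) return false;
            q_.push_back({seq, false});
            return true;
        }
        for (Slot& slot : q_) {                     // case 5: fill a placeholder
            if (slot.seq == seq && slot.missing) {
                slot.missing = false;
                return true;
            }
        }
        return false;                               // duplicate frame: discard
    }

    const std::deque<Slot>& slots() const { return q_; }

private:
    std::size_t capacity_;
    int last_played_;
    std::deque<Slot> q_;
};
```

As in the flow chart, the full-queue test comes first, so a late frame that would fill a placeholder is still discarded when the queue is full. Remaining placeholders are dropped once capacity is reached.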

Decoder 518 decompresses the forwarded frames 534. 
Decompressed frames 563 are then forwarded to a digital- 
to-analog (D/A) converter 520. D/A converter 520 converts 
the digital frames 563 to an analog output 565. Analog 
output 565 represents original analog input 572 generated by 
the first calling device 570. Analog output 565 is forwarded 
to a listening device 522 for playout. 

As previously mentioned, exemplary buffer array 514 
includes a plurality of jitter buffers 531, 532, and 533. Each 
jitter buffer 531, 532, 533 receives unpacked frames from 
module 512. Preferably, the jitter buffers 531, 532, 533 have
various associated jitter buffer values. Such values may 
include by way of example, and without limitation, operat- 
ing characteristics such as steady-state buffer playout depth, 
maximum buffer depth, redundancy coding, and/or FEC 
coding. 

Periodically, the jitter buffer values are evaluated. To 
mask the varying transporting nature of medium 535, the 
performance of each individual jitter buffer in array 514 is 
evaluated according to certain system sensitivities. The jitter 
buffer that results in computationally-desirable transporting 
characteristics vis-a-vis the sensitivities during a specific 
period of time is selected as the computationally-desirable or 
real jitter buffer. The remaining jitter buffers are used as 
virtual buffers. It is only the computationally-desirable or 
real jitter buffer that forwards frames to the decoder for 
playout of real time input, preferably during a subsequent 
talk spurt or burst. A talk spurt or burst, as that term is herein 
used, means the time period extending between two succes- 
sive silence periods. 

The performance of each individual jitter buffer is peri- 
odically or intermittently evaluated. Preferably, where call- 
ing device 570 and listening device 522 are engaged in a 
telephone conversation, each jitter buffer is evaluated at the 
end of a talk spurt, or when a silence is detected by the 
receiver, i.e., when a SID frame or a predetermined number 
of sequential SID frames are received. Therefore, during an 
interactive real time media session, various jitter buffers of
array 514 may be selected at varying times to act as the 
computationally-desirable or real jitter buffer, or alterna- 
tively to serve as a virtual jitter buffer. 
Independent jitter buffer evaluation may be explained by
way of example, and without limitation. The array of jitter
buffers 514 may be evaluated at time t=N. Based on this
evaluation, it may be determined that jitter buffer 533 is the
computationally-desirable or real jitter buffer, and therefore
forwards its buffered frames to decoder 518 for playout
during the next talk spurt. In contrast, jitter buffers 531 and
532 are virtual jitter buffers that do not forward their buffered
frames to decoder 518 for playout. Assume further that, at
t=N+1, jitter buffers 516 are once again evaluated, and it is
determined that jitter buffer 531 is now the
computationally-desirable or real jitter buffer. The previous
computationally-desirable or real jitter buffer 533 no longer
forwards frames to decoder 518 (i.e., jitter buffer 533 is now
only a virtual buffer). Now, the new computationally-desirable
jitter buffer 531 acts as the real jitter buffer and forwards its
buffered frames to the decoder 518.
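
The hand-over from one real jitter buffer to another can be sketched as follows. The sketch assumes each buffer has already been reduced to a single quality score where higher is better. That score, the `BufferState` struct, and the `select_real` function are hypothetical names introduced for illustration and do not come from the appendices.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical per-buffer record kept by the management module.
struct BufferState {
    int id;          // reference numeral of the buffer, e.g. 531
    double quality;  // assumed score from the last evaluation; higher is better
    bool real;       // true only for the buffer that feeds the decoder
};

// Mark the highest-scoring buffer as real and every other buffer as virtual.
// Returns the index of the selected buffer; ties keep the earliest buffer.
std::size_t select_real(std::vector<BufferState>& buffers) {
    assert(!buffers.empty());
    std::size_t best = 0;
    for (std::size_t i = 1; i < buffers.size(); ++i)
        if (buffers[i].quality > buffers[best].quality) best = i;
    for (std::size_t i = 0; i < buffers.size(); ++i)
        buffers[i].real = (i == best);
    return best;
}
```

Calling `select_real` once per evaluation reproduces the example: buffer 533 is real after the evaluation at t=N, and buffer 531 takes over at t=N+1 if its score has become the highest.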

One advantage of periodic independent jitter buffer
evaluation is that it allows a large number of buffer
parameters to be compared and evaluated. Another advantage
of individual evaluation is that it allows various error
correcting codes and/or packet redundancy methods to be
compared and evaluated. By periodic independent jitter
buffer comparison and evaluation, receiver 510 may
dynamically respond to a range of network transporting
characteristics.

Another advantage of such a system is that receiver 510
may dynamically adjust to potential user definable operating
requirements. For example, a user may desire certain system
operation parameters such as error correction coding,
redundancy, or bandwidth limitations upon communication
channel 500. If such operating conditions are desired,
receiver 510 may take these desired conditions into account
during jitter buffer evaluation. Such operating requirements
may also be imposed via management software, such as a
configuration file, as discussed above. It should also be
understood that the operation conditions may be purposely
non-optimal, depending on desired operating characteristics
and user preferences. For instance, a user or organization
may desire FEC for all voice streams to improve accuracy,
despite the fact that it may result in an increased and
non-optimal use of bandwidth.

Jitter buffer array 514 may be evaluated in accordance
with a number of different operating parameters that result
in certain system sensitivities. Various parameters may be
associated with each jitter buffer. Associated parameters
may include, by way of example and without limitation, the
steady-state jitter buffer playout depth and the maximum
jitter buffer depth. Other parameters could include whether a
jitter buffer implements redundancy coding and/or FEC coding.

As used herein, the term steady-state jitter buffer playout
depth means the number of frames that a jitter buffer tries to
maintain in a packet queue. Maximum virtual buffer depth
means the maximum jitter buffer frame size. As previously
described, the term redundancy refers to the number of
previous frames packed into a data packet with the current
frame. Moreover, the FEC coding may include any desirable
scheme, such as the first and second FEC schemes described
above.
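
The per-buffer parameters defined above could be grouped as in the following C++ sketch. The struct layout, the enum, and both helper functions are assumptions made for illustration. The consistency rule (playout depth no greater than maximum depth) in particular is an assumed sanity check, not something the text states.

```cpp
#include <cassert>

// The text refers to a "first" and a "second" FEC scheme; None means no FEC.
enum class FecScheme { None, First, Second };

// Hypothetical parameter set for one jitter buffer.
struct JitterBufferParams {
    int playout_depth;  // steady-state playout depth, in frames
    int max_depth;      // maximum buffer depth, in frames
    int redundancy;     // previous frames packed with the current frame
    FecScheme fec;      // FEC scheme used by this buffer
};

// Frames carried by one data packet: the current frame plus the redundant ones.
int frames_per_packet(const JitterBufferParams& p) {
    assert(p.redundancy >= 0);
    return 1 + p.redundancy;
}

// Assumed sanity check: a buffer cannot hold steady at more frames than it can store.
bool is_consistent(const JitterBufferParams& p) {
    return p.playout_depth >= 0 && p.playout_depth <= p.max_depth && p.redundancy >= 0;
}
```

A buffer with a redundancy of 2, for example, expects three frames per packet: frame n together with frames n-1 and n-2.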

Essentially, jitter buffer evaluation is a tradeoff between
packet delay (i.e., buffer depth), packet loss, and bandwidth.
The array of buffers may be evaluated according to
administratively chosen sensitivities. Preferably, jitter buffers
are evaluated based on sensitivities related to packet delay,
packet loss, and bandwidth. Alternatively, jitter buffers may
be evaluated according to other sensitivities, such as packet
variance. By evaluating the various jitter buffers based on
such sensitivities, receiver 510 may dynamically adjust to a
wide range of communication channel and user requirements.

Preferably, each evaluation period spans a number of
packets or an amount of time large enough for the collected
statistics to provide a meaningful evaluation. The length of
an evaluation period is dependent upon the voice
coder-decoder (codec) implemented. In addition, the longer
the evaluation period, the more data is available for making a
decision as to the performance of the jitter buffers. A longer
evaluation period with more data results in a better decision
as to which jitter buffer is performing the best. However, the
longer the evaluation period, the longer the wait for a
computationally-desirable jitter buffer to be selected, and the
longer a user may have to endure a lower quality
conversation. Accordingly, the length or duration of an
evaluation period should be long enough to obtain a
meaningful decision as to the performance of the jitter
buffers, but short enough to enable frequent selection of the
computationally-desirable jitter buffer. Preferably, the
evaluation period is at least as long as a round trip packet
delay for a given conversation, but no longer than twenty
round trip packet delays.
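
The preferred bounds on the evaluation period (between one and twenty round trip packet delays) amount to a simple clamp. The following C++ sketch shows that arithmetic. The function name `clamp_evaluation_period` and the `Ms` alias are hypothetical, and the round trip delay is assumed to have been measured elsewhere.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>

using Ms = std::chrono::milliseconds;

// Clamp a desired evaluation period to the range [1, 20] round trip delays.
Ms clamp_evaluation_period(Ms desired, Ms round_trip) {
    assert(round_trip > Ms::zero());
    return std::clamp(desired, round_trip, 20 * round_trip);
}
```

With a 100 ms round trip, a requested 50 ms period is raised to 100 ms, a requested 500 ms period is kept, and a requested 5000 ms period is cut to 2000 ms.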

Where input is a telephone conversation, the length of an 
evaluation period may depend on talk spurt dynamics. In 
other words, the jitter buffers are evaluated once receiver 
510 detects a SID frame or a threshold number of sequential 
SID frames. In general, where real time media input is a 
telephone conversation, buffers may be evaluated during the 
silence between talk spurts. 
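
One possible way to trigger an evaluation on SID frames is sketched below. It assumes that each received frame has already been classified as SID or non-SID; the function name and the `sid_threshold` parameter are hypothetical names chosen for illustration.

```python
from typing import Iterable, Iterator


def evaluation_points(frame_is_sid: Iterable[bool],
                      sid_threshold: int = 1) -> Iterator[int]:
    """Yield the index of each frame at which an evaluation would start.

    An evaluation is triggered once per silence period, at the moment the
    run of consecutive SID frames first reaches sid_threshold.
    """
    if sid_threshold < 1:
        raise ValueError("sid_threshold must be at least 1")
    run = 0  # length of the current run of consecutive SID frames
    for index, is_sid in enumerate(frame_is_sid):
        run = run + 1 if is_sid else 0
        if run == sid_threshold:
            yield index
```

With a threshold of one, the evaluation starts at the first SID frame after each talk spurt. A larger threshold ignores very short silences.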

Preferably, jitter buffer values are evaluated as to how the
values affect the overall system sensitivity, which is predicated
on sensitivity settings. Preferably, sensitivity settings
may include packet delay ($S_d$), packet loss ($S_l$), and
bandwidth ($S_b$). These sensitivities may be static or dynamic,
depending on administrative and user preferences, and may
be user definable or determined by network management
software, such as a configuration file (discussed above).
While static settings are relatively quick and simple, 
dynamic settings may allow feedback and adjustments to 
changing network conditions. 

FIG. 11 provides an exemplary implementation of three 
sensitivity settings 580, 582 and 584 for the communication 
channel 500 shown in FIG. 1. In this exemplary 
implementation, only three sensitivity settings are provided. 
Alternatively, a greater or lesser number of settings may be 
selected according to the desired overall system sensitivity. 
Sensitivity is represented as a slide bar that ranges in value 
from 0 to 1 on a continuous scale. With such a scale, the 
closer to 1 the sensitivity setting is, the more sensitive the 
evaluation scheme will be to the user defined sensitivity. The 
closer to 0 the sensitivity setting is, the less sensitive the 
evaluation scheme is to that particular parameter. 
Consequently, the exemplary implementation provided in 
FIG. 11 is generally sensitive to delay, moderately sensitive 
to bandwidth, and generally insensitive to packet loss. 

Where system sensitivities are determined by software, 
some knowledge of the voice codec being used and the 
current state of the network may be required. For example, 
a G.711 codec requires more bandwidth than a G.723.1
codec. Likewise, a short codec, such as a G.729A codec,
with 10 ms frames, is three times as sensitive to jitter and
packet variance as a longer codec, such as a G.723.1 codec,
with 30 ms frames. Such information about the voice codec
being used and the current state of the network may be
recorded and provided by the buffer management module.
Assuming that this information is available, bandwidth
sensitivity may be set based on current network load, the
sensitivity to loss could be based on the ability of the voice
codec to conceal lost packets, and the delay sensitivity could
be based on the end-to-end latency of the current voice path
through the network, as well as the frame size of the codec.

In the exemplary embodiment illustrated in FIG. 1, each
jitter buffer computes and maintains its independent values
that may include the mean buffer depth (d), the mean loss
rate (l), and the required bandwidth (b). The mean buffer
depth (d) is determined from sampling. A sampling rate may
be periodic or intermittent with its period set equal to a
constant or varying frame duration. Preferably, the sampling
rate is continuously varied by basing its period on a diversity
of randomly generated values. Although additional hardware
and/or software may be required, the buffer depth (d) may
also be sampled according to a Poisson-like process. As
known in the art, a Poisson process is a method for performing
unbiased sampling based on well-known mathematical
principles. In any event, a varied sampling rate is
preferred since it is less sensitive to system periodicity and
produces less biased samples.
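
A Poisson-like sampling schedule may be sketched as follows. The sketch assumes that the gaps between samples are drawn from an exponential distribution, which is the defining property of a Poisson process; the function names, the `depth_at` callback, and the chosen mean interval are hypothetical and are not taken from the specification.

```python
import random
from typing import Callable, List


def poisson_sample_times(mean_interval_s: float, duration_s: float,
                         rng: random.Random) -> List[float]:
    """Return sample instants whose gaps are exponentially distributed."""
    if mean_interval_s <= 0:
        raise ValueError("mean_interval_s must be positive")
    rate = 1.0 / mean_interval_s
    times = []
    t = rng.expovariate(rate)
    while t < duration_s:
        times.append(t)
        t += rng.expovariate(rate)
    return times


def mean_depth(depth_at: Callable[[float], float],
               sample_times: List[float]) -> float:
    """Average the buffer depth observed at each sample instant."""
    if not sample_times:
        return 0.0
    return sum(depth_at(t) for t in sample_times) / len(sample_times)
```

Because the sample instants are not locked to the frame clock, a buffer whose depth varies periodically with the frame duration would not be observed at the same phase on every sample.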

Mean packet loss (l) may be periodically determined by
counting packet losses and packet arrivals, preferably over
a duration of a current talk spurt. The bandwidth (b) requirement
is generally a constant, since this requirement is based
on a per-buffer basis. In other words, different buffers will
have different bandwidth requirements, since the buffers
comprising the buffer array may have different
characteristics, such as redundancy or FEC coding.
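
The loss counting described above may be sketched as a pair of counters that are reset at the start of each talk spurt. The class and method names are hypothetical, and the sketch assumes that the caller already knows when a packet has arrived and when one has been declared lost.

```python
class LossCounter:
    """Counts packet arrivals and losses over one talk spurt."""

    def __init__(self) -> None:
        self.arrived = 0
        self.lost = 0

    def record_arrival(self) -> None:
        self.arrived += 1

    def record_loss(self) -> None:
        self.lost += 1

    def mean_loss_rate(self) -> float:
        """Return lost / (lost + arrived), or 0.0 if nothing was counted."""
        total = self.arrived + self.lost
        return self.lost / total if total else 0.0

    def reset(self) -> None:
        """Start counting afresh, for example at the next talk spurt."""
        self.arrived = 0
        self.lost = 0
```

The resulting rate already lies between 0 and 1, so it needs no further scaling before it is combined with the other jitter buffer values.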

Preferably, the buffer depth (d), mean packet loss (l), and
bandwidth (b) values are normalized to values between 0 and 1.
Consequently, the smaller the normalized value, the greater
the end-user quality and therefore the system quality. For example,
since the mean packet loss rate (l) already takes on values
between 0 and 1, a lower value is more desirable in terms of
end-user quality.

Buffer depth (d) and bandwidth (b) are normalized with
respect to the largest value that the buffer can take on in a
given receiver. For buffer depth (d), the greatest depth that any
jitter buffer is programmed to maintain at steady state is
determined, and each jitter buffer's mean depth is divided by
this steady state number.

For bandwidth (b), the greatest degree of per-frame overhead
(not including packet headers) used by any virtual
buffer is determined. The per-frame overhead of each virtual
buffer is divided by this number.

The following example provides an illustration of how the
jitter buffer values are normalized and is not intended as a
limitation. Assume that an IP telephony receiver, such as the
receiver 510 shown in FIG. 1, includes a buffer array
comprising a set of jitter buffers. Further assume that the
greatest steady-state buffer depth used by any jitter buffer of
the array is 6 frames. Also assume that, in the context of this
example, the greatest redundancy used by any of the jitter
buffers is 3 frames.

Now assume that, after a certain period of time, the jitter
buffers have computed their independent values. Further
assume that, during this period of time, one of the jitter
buffers had a mean depth of 3 frames, had a packet loss rate
of 5%, and had a bandwidth of 2. For this specific jitter
buffer, then, the following jitter buffer values may be
computed as follows:

$$d = \frac{\text{mean depth}}{\text{greatest steady state depth}} = \frac{3}{6} = 0.5$$

$$l = 5\% = 0.05$$

$$b = \frac{\text{buffer per-frame overhead}}{\text{greatest degree of per-frame overhead}} = \frac{2}{3} \approx 0.66$$
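
The normalization in this example may be sketched as follows. The function and parameter names are hypothetical. Note that the code keeps the full value of 2/3 for the bandwidth, whereas the example above truncates it to 0.66.

```python
from typing import Tuple


def normalize_values(mean_depth_frames: float, loss_rate: float,
                     overhead_frames: float, greatest_depth_frames: float,
                     greatest_overhead_frames: float) -> Tuple[float, float, float]:
    """Return the normalized (d, l, b) values for one jitter buffer.

    Depth and bandwidth are divided by the greatest value used by any
    buffer in the array. The loss rate is already between 0 and 1.
    """
    if greatest_depth_frames <= 0 or greatest_overhead_frames <= 0:
        raise ValueError("greatest values must be positive")
    d = mean_depth_frames / greatest_depth_frames
    b = overhead_frames / greatest_overhead_frames
    return d, loss_rate, b


# Values from the example: mean depth 3 of 6, loss 5%, overhead 2 of 3.
d, l, b = normalize_values(3, 0.05, 2, 6, 3)
```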

These values are then computed for each jitter buffer. 
These values are then used to compute a jitter buffer quality 
(q) for each buffer. The relative jitter buffer quality provides 
a general measure of the quality of service a jitter buffer 
would have provided during a previous period of time. By 
way of example, and without limitation, an exemplary jitter 
buffer quality may take into account one or more jitter buffer 
values, as well as one or more system sensitivities. An 
exemplary jitter buffer quality may be represented by the
equation:

$$q = S_d \cdot d + S_l \cdot l + S_b \cdot b$$

Returning to the sensitivity settings of the example illustrated
in FIG. 11, the sensitivity settings may be approximated
as $S_d \approx 1$, $S_l \approx 0$, and $S_b \approx 0.5$. Based on these sensitivity
settings, the hypothetical jitter buffer would result in the
following jitter buffer quality:

$$q = (1)(0.5) + (0)(0.05) + (0.5)(0.66) = 0.83$$

The q value for the remaining jitter buffers would also be 
computed. Based on the resulting quality computations, the 
computationally-desirable or real jitter buffer is chosen to be 
the jitter buffer generating the smallest value for q. 
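
The quality computation and selection may be sketched as follows. The sketch assumes the weighted-sum form implied by the worked calculation above, in which each normalized value is multiplied by its sensitivity and the products are added. The type names, field names, and buffer labels are hypothetical.

```python
from typing import Dict, NamedTuple


class BufferValues(NamedTuple):
    d: float  # normalized mean buffer depth
    l: float  # mean loss rate
    b: float  # normalized bandwidth


class Sensitivities(NamedTuple):
    s_d: float  # sensitivity to packet delay, 0 to 1
    s_l: float  # sensitivity to packet loss, 0 to 1
    s_b: float  # sensitivity to bandwidth, 0 to 1


def quality(values: BufferValues, s: Sensitivities) -> float:
    """Weighted sum of the buffer values; a smaller q is more desirable."""
    return s.s_d * values.d + s.s_l * values.l + s.s_b * values.b


def select_buffer(values_by_name: Dict[str, BufferValues],
                  s: Sensitivities) -> str:
    """Return the name of the jitter buffer with the smallest q."""
    return min(values_by_name, key=lambda name: quality(values_by_name[name], s))
```

With the sensitivities of FIG. 11 (approximately 1, 0, and 0.5), the hypothetical jitter buffer with values 0.5, 0.05, and 0.66 yields a quality of 0.83. Any other buffer in the array with a smaller q would be selected in its place.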

In a preferred embodiment, jitter buffer delay and loss 
characteristics are calculated on a per talk spurt or burst 
basis. Preferably, a determination of jitter buffer perfor- 
mance is made during each silence period following a talk 
spurt or burst. A potential disadvantage of this type of 
evaluation is that it may lead to oscillation between two or
more jitter buffers if, for example, the network behavior
changes dramatically during periodic buffer evaluation. In 
order to smooth out the effects of transient conditions, the 
current delay and loss characteristics may be computed 
using an exponentially weighted moving average (EWMA) 
over a window of the last n talk spurts. 
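
An exponentially weighted moving average over the last n talk spurts may be sketched as follows. The specification does not state a smoothing weight or a starting value, so the `alpha` parameter and the choice to seed the average with the oldest value in the window are assumptions made only for illustration.

```python
from typing import Sequence


def ewma_last_n(per_spurt_values: Sequence[float], n: int,
                alpha: float = 0.25) -> float:
    """Return the EWMA of the last n per-talk-spurt measurements.

    The average is seeded with the oldest value in the window. Each newer
    value is then blended in with weight alpha.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must be in (0, 1]")
    window = list(per_spurt_values)[-n:]
    if not window:
        raise ValueError("at least one measurement is required")
    average = window[0]
    for value in window[1:]:
        average = alpha * value + (1.0 - alpha) * average
    return average
```

A smaller `alpha` gives more weight to older talk spurts and therefore damps oscillation more strongly, at the cost of a slower reaction to a lasting change in network behavior.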

In a preferred embodiment, jitter buffer evaluation uses 
first-order statistics for packet delay and loss. Alternatively, 
second-order statistics, such as the variation of delay and 
loss, may also be computed. A selection scheme using this 
information could attempt to minimize these second-order 
statistics, as well as the first-order statistics. Second-order
statistics are well-known in the art, and are usually based on 
the second moment of a data set. While variance is com- 
monly used in second order statistics, mean absolute devia- 
tion may also be used, especially since mean absolute 
deviation is relatively easy to calculate in real time. Inter- 
quartile range is another second-order statistic that may be 
computed and used. 
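
The three measures of spread named above may be computed as sketched below. The sketch uses Python's standard `statistics` module and the population form of the variance. The function name and the dictionary keys are hypothetical, and the quartiles use that module's default method, which is only one of several common definitions.

```python
import statistics
from typing import Dict, Sequence


def spread_statistics(samples: Sequence[float]) -> Dict[str, float]:
    """Return variance, mean absolute deviation, and interquartile range.

    At least two samples are needed for the quartile computation.
    """
    mean = statistics.fmean(samples)
    variance = statistics.pvariance(samples, mu=mean)
    mean_abs_dev = statistics.fmean([abs(x - mean) for x in samples])
    q1, _median, q3 = statistics.quantiles(samples, n=4)
    return {
        "variance": variance,
        "mean_abs_dev": mean_abs_dev,
        "iqr": q3 - q1,
    }
```

Mean absolute deviation requires only additions and absolute values once the mean is known, which is consistent with the remark above that it is relatively easy to calculate in real time.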

By monitoring various transporting characteristics of the 
transporting network 535, communication channel 500 
offers a number of advantages. For example, the transmitter 
502 and the receiver 510 periodically adapt to varying
transporting dynamics and conditions of the transporting 
network 535. For a non-guaranteed packet switched 
network, the network transporting dynamics may be 
assessed by way of the jitter buffer array by evaluating 
various transporting characteristics such as the packet delay 
distribution, error correction coding, and packet loss 
percentage, all of which may or may not be implemented at 
a given period of packet transportation. 

It should be readily apparent from the foregoing description
and accompanying drawings that the present invention
overcomes the problems associated with the prior art,
especially the problems identified above in the background
section. It should also be understood that the evaluation and
sensitivities of the present invention for buffer performance
may be used with other buffer systems not described herein.
In addition, it should further be understood that other
evaluations and sensitivities for buffer performance not
described herein may be used with the buffer system of the
present invention.

Those skilled in the art to which the invention pertains
may make modifications in other embodiments employing
the principles of this invention without departing from its
spirit or essential characteristics, particularly upon considering
the foregoing teachings. Accordingly, the described
embodiments are to be considered in all respects only as
illustrative, and not restrictive, and the scope of the invention
is, therefore, indicated by the appended claims rather
than by the foregoing description. Consequently, while the
invention has been described with reference to particular
embodiments, modifications of structure, sequence, materials
and the like would be apparent to those skilled in the art,
yet still fall within the scope of the invention.
We claim: 

1. A gateway for receiving a transported stream of data
packets comprising:

a buffer management device receiving the data packets,
unpacking the data packets and forwarding a stream of
data frames;

a first jitter buffer receiving the data frames from the
buffer management device and buffering the data
frames;

a second jitter buffer receiving the data frames from the
buffer management device and buffering the data
frames;

a computationally-desirable jitter buffer selected from the
first jitter buffer or the second jitter buffer, the
computationally-desirable jitter buffer selected by comparing
a first jitter buffer quality and a second jitter
buffer quality; and

a decoder receiving buffered data frames from the
computationally-desirable jitter buffer.

2. The invention of claim 1 wherein the buffer management
device uses one of redundancy and forward error
correction coding to recover lost data packets.

3. The invention of claim 1 wherein the jitter buffer 
quality comprises at least one jitter buffer value and at least 
one receiver sensitivity setting. 

4. The invention of claim 3 wherein the jitter buffer value
is selected from a group including mean buffer depth, mean
loss rate, and bandwidth.

5. The invention of claim 3 wherein the receiver sensi- 
tivity setting is selected from the group including packet 
delay, packet loss, and bandwidth. 

6. The invention of claim 1 wherein the decoder plays out
the data frames received from the computationally-desirable
jitter buffer.

7. The invention of claim 1 wherein the jitter buffer
quality is user definable.

8. The invention of claim 1 wherein the computationally-
desirable jitter buffer is periodically selected.

9. The invention of claim 3 wherein the sensitivity setting 
is user definable. 

10. The invention of claim 1 wherein the jitter buffer
quality is used for evaluating a dynamic characteristic of a
transporting medium that transports the data packets from an
encoding device to the gateway.

11. The invention of claim 1 wherein the jitter buffer
quality is computed using an exponentially weighted moving
average.

12. The invention of claim 11 wherein the exponentially
weighted moving average is computed over a plurality of
time periods.

13. The invention of claim 1 wherein the jitter buffer 
quality is computed using a second-order statistical analysis. 

14. The invention of claim 13 wherein the second-order
statistical analysis is a Poisson process.

15. A method for receiving a transported stream of data
packets comprising the steps of:

receiving the data packets at a management module;

unpacking the data packets at the management module;

forwarding a first stream of data frames to a first jitter
buffer;

forwarding a second stream of data frames to a second
jitter buffer;

buffering the data frames at the first jitter buffer and the
second jitter buffer;

computing a first jitter buffer quality for the first jitter
buffer and a second jitter buffer quality for the second
jitter buffer;

selecting either the first or the second jitter buffer as a
computationally-desirable jitter buffer based on the first
and second jitter buffer qualities; and

forwarding the buffered data frames from the
computationally-desirable buffer to a decoder.

16. The invention of claim 15 wherein the jitter buffer 
quality comprises at least one jitter buffer value and at least 
one receiver sensitivity setting. 

17. The invention of claim 16 wherein the jitter buffer
value is selected from a group including mean buffer depth,
mean loss rate, and bandwidth.

18. The invention of claim 16 wherein the receiver 
sensitivity setting is selected from a group including packet 
delay, packet loss, and bandwidth. 
19. The invention of claim 15 further comprising the step
of playing out the data frames received from the 
computationally-desirable jitter buffer. 

20. The invention of claim 15 wherein the jitter buffer 
quality is user definable. 

21. The invention of claim 15 further comprising the step 
of periodically selecting the computationally-desirable jitter 
buffer. 

22. The invention of claim 16 wherein the sensitivity 
setting is user definable. 

23. The invention of claim 15 further comprising the step 
of evaluating dynamic characteristics of a network that 
transports the data packets from an encoding device to the 
gateway. 

24. The invention of claim 15 further comprising the step 
of computing the jitter buffer quality as an exponentially 
weighted moving average. 

25. The invention of claim 24 further comprising the step 
of computing the exponentially weighted moving average 
over a plurality of time periods. 

26. The invention of claim 15 wherein the jitter buffer 
quality is computed using a second-order statistical analysis. 

27. The invention of claim 15 further comprising the step 
of simulating the arrival of further redundancy frames for 
each jitter buffer with a redundancy value greater than a 
redundancy value of the data packets. 

28. The invention of claim 15 further comprising the step 
of simulating the arrival of forward error correction frames 
for each jitter buffer with a forward error correction
scheme. 

29. The invention of claim 15 further comprising the step 
of forwarding a silence frame from the computationally- 
desirable buffer to the decoder when there is no buffered data 
frame present in the computationally-desirable buffer. 

30. The invention of claim 15 further comprising the steps 
of recovering missing data frames and inserting the missing 
data frames into at least one of the jitter buffers. 

31. The invention of claim 26 wherein the second-order
statistical analysis is a Poisson process. 

***** 